READ_ME_SIMILARITIES_PROGRAM.TXT

   Ryan Bakker and Keith T. Poole (ktpoole@uga.edu)

***(CONTACT KEITH POOLE (ktpoole@uga.edu) IF YOU NEED HELP RUNNING THIS
CODE.  UPDATED CODE AND EXAMPLES WILL BE POSTED HERE:

http://voteview.com/Bakker_Poole_Bayesian_MDS.htm

WE WILL EVENTUALLY HAVE PROGRAMS THAT CAN BE RUN FROM WITHIN R WITH A
SIMPLE INTERFACE.  WE PLAN ON HAVING THIS DONE WITHIN 6 MONTHS.)***


1) The similarities/dissimilarities program that we use in our paper
"Bayesian Metric Multidimensional Scaling" is:

slice2_sen90.c

2) It is set up to read the file

sen90_distances3.dat

which contains the dissimilarities between members of the 90th Senate.
These were computed from the file SEN90KH.ORD, which is posted on
voteview.com.  The entries can range from 0.00 to 2.0 (although the
observed maximum dissimilarity is less than 2.0).

3) The C program is set up to run on an iMac running OS X, and it links
in the LAPACK and BLAS libraries from R.  The program also needs the
header files:

lbfgs.h  (Limited-memory Broyden-Fletcher-Goldfarb-Shanno header file)
arithmetic_ansi.h (Used by L-BFGS)

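4) Place the program, the data file, and the two header files in your
working directory, and edit the #include lines in the program so that
they point to wherever the header files are located; e.g.,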

#include </Users/keithtpoole/lbfgs.h>

Search through the code with a text editor such as Emacs to find the header declarations.

5) To compile, link, and load, use the command:

gcc -o slice2_sen90 slice2_sen90.c -lcblas -lclapack

To run the program you need to have both R and the GNU Compilers
installed.  Instructions for installing the GNU Compilers are here:

http://voteview.com/measure_Install_RTools.htm

6) The program opens a number of files.  Below are code fragments from
the C program showing these open statements: 

6A)  This File Contains some miscellaneous diagnostic output

jp =fopen("data_signs_s90.txt","w"); 

6B) This File has the slice sampler output -- All the Configurations
in the Markov Chain (the first 10,000 are burn-in -- This is set 
below with the variable nburn): 

kp = fopen("slice2_signs_s90.txt","w"); 

The 3rd line from the bottom of the file contains the means of the
100,000 trials after burn-in.  The 2nd line from the bottom contains
the target coordinates (produced by L-BFGS) for the rotation of each
configuration in the chain.  The last line of the file contains the
standard deviations of the coordinate means.  For two dimensions, note
that the coordinate means start in the sixth column.  The estimate of
sigma-squared is the next-to-last number in the row of means, and its
standard deviation is the number in the same column of the last row of
the file.
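
The three summary rows described above can be pulled out of the output
file with a few lines of C.  What follows is only a sketch and is not
part of slice2_sen90.c: the function name read_last_three_lines is
hypothetical, it assumes the output file is plain text with one row per
line and no trailing blank lines, and it relies on the POSIX getline()
function (available on OS X and Linux, but not part of ISO C).

```c
#define _POSIX_C_SOURCE 200809L  /* for getline(); POSIX, not ISO C */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not in the original program).
   Reads every line of fp and keeps only the final three:
     tail[0] = 3rd line from the bottom (coordinate means)
     tail[1] = 2nd line from the bottom (L-BFGS target coordinates)
     tail[2] = last line (standard deviations)
   Slots stay NULL if the file has fewer than three lines.
   The caller must free() each non-NULL tail[k].
   Returns the total number of lines read. */
long read_last_three_lines(FILE *fp, char *tail[3])
{
    char *line = NULL;
    size_t cap = 0;
    long count = 0;

    tail[0] = tail[1] = tail[2] = NULL;
    while (getline(&line, &cap, fp) != -1) {
        free(tail[0]);      /* drop the oldest kept line */
        tail[0] = tail[1];
        tail[1] = tail[2];
        tail[2] = line;     /* take ownership of getline's buffer */
        line = NULL;        /* make getline allocate a fresh buffer */
        cap = 0;
        count++;
    }
    free(line);             /* getline may allocate even when it fails */
    return count;
}
```

To use it, open slice2_signs_s90.txt with fopen(...,"r"), pass the FILE
pointer and a three-element array of char pointers, and then parse the
numbers in each returned row with sscanf or strtod.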

6C) The Dissimilarities Data Being Read for the 90th Senate:

if((fp =fopen("sen90_distances3.dat","r"))==NULL)

We do a fail-safe check on these dissimilarities with this block of code:

/*
DO TRANSFORMATION TO DISTANCES HERE
*/
  for(i=0;i<nrowX;i++)
  {
	  for(j=0;j<ncolX;j++)
	  {
		  if(XREAD[i+j*nrowX] >= 0)
		  {
			  X[i+j*nrowX] = XREAD[i+j*nrowX];
		  }
		  if(XREAD[i+j*nrowX] < 0)
		  {
			  X[i+j*nrowX] = -999;
		  }
	  }
  }
Note that this places -999.0 in the vector X[.] for missing data.
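
Before running the sampler on a new data set it can be useful to see how
many cells ended up flagged as missing and whether the largest entry is
within the expected 0 to 2.0 range.  The sketch below is not part of
slice2_sen90.c; the names summarize_dissimilarities and MISSING_VALUE are
hypothetical.  It assumes only what the fragment above shows: the matrix
is stored in a vector indexed as i+j*nrowX, and negative values mark
missing data.

```c
#include <stddef.h>

#define MISSING_VALUE (-999.0)  /* same sentinel the program writes into X */

/* Hypothetical helper (not in the original program).
   Scans an nrow x ncol matrix stored as x[i + j*nrow] and returns the
   number of missing (negative) cells.  *max_seen receives the largest
   non-missing entry, or MISSING_VALUE if every cell is missing. */
size_t summarize_dissimilarities(const double *x, int nrow, int ncol,
                                 double *max_seen)
{
    size_t n_missing = 0;
    double max_val = MISSING_VALUE;

    for (int j = 0; j < ncol; j++) {
        for (int i = 0; i < nrow; i++) {
            double v = x[i + j * nrow];
            if (v < 0.0) {
                n_missing++;        /* flagged as missing data */
            } else if (v > max_val) {
                max_val = v;        /* new largest valid dissimilarity */
            }
        }
    }
    *max_seen = max_val;
    return n_missing;
}
```

Calling it as summarize_dissimilarities(X, nrowX, ncolX, &maxd) right
after the transformation loop would report the count of missing cells
and leave the largest dissimilarity in maxd.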

7) Below are the key variables that have to be set to read a
dissimilarities matrix other than the one for the 90th Senate.  NS is
the number of dimensions.  N is used in the L-BFGS routine and the
formula is:

N=(nrowX*NS)-(NS*(NS+1)/2).  

NDIM is used in the slice sampler and the formula is:

NDIM=(nrowX-1)*NS + 1

With dissimilarities data, nrowX=ncolX.

Finally, SIGMAPRIOR is the prior on SIGMA-SQUARED for the Log-Normal.

//
#define NS 2                    /*  Number of Dimensions */
#define N 201  /* 201 if NS=2, 300 if NS=3, 398 if NS=4, USED IN L-BFGS ROUTINE -- SET EQUAL to (nrowX*NS)-3 if NS=2; set equal to (nrowX*NS) - 6 if NS=3; set equal to (nrowX*NS) - 10 if NS=4*/
#define NDIM 203 /* 203 if NS=2, 304 if NS=3, 405 if NS=4*/                /* (nrowX-1)*NS+1 Number of Coordinates Being Estimated + Variance Term*/
#define nrowX 102                /*  Number of Rows in the Dissimilarities Matrix */
#define ncolX 102
#define SIGMAPRIOR 100.0
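
The values of N and NDIM for a different matrix size or number of
dimensions can be worked out from the two formulas in item 7 instead of
by hand.  The sketch below is not part of slice2_sen90.c, which
hard-codes these values with #define; the function names
lbfgs_param_count and slice_param_count are hypothetical.

```c
#include <assert.h>

/* Hypothetical helpers (not in the original program). */

/* Value for N, used by the L-BFGS routine:
   (nrowX*NS) - (NS*(NS+1)/2). */
int lbfgs_param_count(int n_rows, int ns)
{
    assert(n_rows > 0 && ns > 0);
    return n_rows * ns - ns * (ns + 1) / 2;
}

/* Value for NDIM, used by the slice sampler:
   (nrowX-1)*NS coordinates plus one variance term. */
int slice_param_count(int n_rows, int ns)
{
    assert(n_rows > 1 && ns > 0);
    return (n_rows - 1) * ns + 1;
}
```

For the 90th Senate (nrowX=102), these reproduce the values in the
comments above: 201, 300, and 398 for N, and 203, 304, and 405 for NDIM,
with NS=2, 3, and 4 respectively.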
